Abstract

We present a geometrical method for analyzing sequential estimating procedures. 
It is based on the design principle of the second-order efficient sequential estimation 
provided in Okamoto, Amari and Takeuchi (1991). By introducing a dual confor- 
mal curvature quantity, we clarify the conditions for the covariance minimization 
of sequential estimators. These conditions are further elaborated for the multidimensional
curved exponential family. The theoretical results are then numerically
examined by using typical statistical models, von Mises-Fisher and hyperboloid 
models. 

Keywords and phrases: Affine connections, Curved exponential family, Hyperboloid distribution,
Information geometry, Projective transformation, Riemannian metric, Space of

1 Introduction



Sequential estimation continues observations until the observed sample satisfies a certain 
prescribed criterion. Its properties have been shown to be superior on the average to 
those of nonsequential estimation in which the number of observations is fixed a priori. 
Specifically the developments of higher-order asymptotic theory have suggested that the 
information loss due to the exponential curvature of the statistical model might be re- 
covered by a sequential estimation procedure which makes use of the ancillary statistic. 
Such an estimator is expected to have a uniformly better characteristic on the average 
(see e.g. Sørensen (1986)). Takeuchi and Akahira (1988) formulated this scheme rigorously
and analyzed the higher-order efficiency of sequential estimation procedures in the
scalar parameter case (see also Akahira and Takeuchi (1989)). They showed that in the 
sequential case the exponential curvature term in the second-order variance can be elim- 
inated by a second-order efficient estimator, and the maximum likelihood estimator with 
an appropriate stopping rule gives such a sequential estimator. This also implies that 
appropriately designed sequential estimators are superior to nonsequential estimators in 
the asymptotic sense. 

Following the work of Takeuchi and Akahira (1988, 1989), Okamoto, Amari and 
Takeuchi (1991) generalized the results to the multiparameter case by using the geometri- 
cal method, and studied characteristics of more general sequential estimation procedures. 
In the nonsequential case, a statistical manifold is uniformly enlarged N times when
we use N observations, keeping the intrinsic features of the manifold unchanged. In a
sequential estimation procedure with a certain stopping rule, the observed sample size N 
is a random variable depending on the position of a statistical manifold. This causes a 
nonuniform expansion of the statistical manifold. Such an expansion is called the confor- 
mal transformation in geometry, since it changes the scale locally and isotropically but 
it does not change the shape of a figure (it does not change the orthogonality). The
result of Takeuchi and Akahira can be interpreted as showing that it is possible to reduce the
exponential curvature of a statistical manifold to zero by a suitable conformal transfor- 
mation. The conformal geometry thus is an adequate framework for the analysis of the 
sequential inferential procedures if we extend the concept of the conformal transformation 
to the statistical manifold which is the Riemannian manifold with a dual couple of affine 
connections. 

As a sequel to Okamoto, Amari and Takeuchi (1991), this paper investigates the se- 
quential estimating procedures from the information geometrical viewpoint. The novelty 
of this paper is the introduction of the dual conformal Weyl-Schouten curvature of a 
statistical manifold, and this quantity will be proved to play a central role when consider- 
ing the problem of covariance minimization under the sequential estimating procedures. 
Information geometry originated with the work of Amari (1985), and it has been
establishing a solid status as a mathematical methodology for a variety of statistical sciences
(see e.g. Amari et al. (1987), Amari and Nagaoka (2000), Kumon (2009, 2010)). In
line with these developments, the present paper also intends to provide a starting point
for studying the conformal geometry of a statistical manifold itself, following the work
of Lauritzen (1987).

The organization of the paper is as follows. The established known results are cited as 
propositions, and the results obtained in this paper are stated as theorems. In the next 
section, we prepare some statistical notations and preliminary results which will be rele- 
vant in this paper. In Section 3, we formulate a conformal transformation of a statistical 
manifold, where a set of dual Weyl-Schouten curvature tensors is introduced. Then we 
elucidate their implications in the structures of statistical manifolds. In this connection, 
the meaning of conjugate symmetry is also explained, which is the notion first introduced 
by Lauritzen (1987). In Section 4, the general result in the previous section is used to 
write out the structure of the multidimensional exponential family. In Section 5, the general
result is used to delineate the structure of multidimensional curved exponential family, 
where the dual Euler-Schouten curvatures are introduced. Then related notions such as 
totally exponential umbilic and dual quadric hypersurface are shown to involve key ele- 
ments in studying statistical submanifolds. In Section 6, the geometrical results obtained 
in the previous sections are applied to the sequential estimation in a multidimensional 
curved exponential family, where we give a concrete procedure for the covariance mini- 
mization. In Section 7, the results in Section 6 are numerically examined by two typical 
curved exponential families called the von Mises-Fisher model and the hyperboloid model. 
Section 8 is devoted to some additional discussions and a perspective of future work. 



2 Preliminaries 

Let us denote by $X(t) = (X_1(t), \ldots, X_k(t))'$ a $k$-dimensional random process defined on
the probability space $[\Omega, \mathcal{F}, P]$ with values in $[E, \mathcal{E}]$, where $E = \mathbb{R}^k$ and $\mathcal{E}$ is the $\sigma$-field
of all Borel sets in $E$. The time parameter $t \in T$ runs over all non-negative integers
$T = \{0, 1, 2, \ldots\}$ or over all non-negative real numbers $T = [0, +\infty)$. Moreover, let
$\mathcal{F}_t$, $t \in T$, denote the $\sigma$-field generated by the random vectors $X(s)$, $s \le t$.

We assume that the probability measure $P$ depends on an unknown parameter $\theta =
(\theta^1, \ldots, \theta^m)' \in \Theta$, $P = P_\theta$, where $\Theta$ is homeomorphic to $\mathbb{R}^m$, and we shall consider the
case where the following conditions are fulfilled:

(i) $X(t)$ is continuous in probability, has stationary independent increments and

$$P_\theta(X(0) = 0) = 1, \quad \theta \in \Theta.$$

(ii) The probability distributions at any time $t$ are dominated by a $\sigma$-finite measure $\mu$
and have the densities $f(x, t, \theta)$ with respect to $\mu$

$$\frac{dP_\theta}{d\mu}(x, t) = f(x, t, \theta), \quad x \in E, \ t \in T, \ \theta \in \Theta. \tag{1}$$

We say that $\{P_\theta \mid \theta \in \Theta\}$ is an $m$-dimensional full regular minimally represented exponential
family (f.r.m. exponential family) when the densities (1) can be written as

$$f(x, t, \theta) = \exp\{\theta^i x_i - \psi(\theta) t\}, \tag{2}$$

where $x = (x_1, \ldots, x_m)' \in \mathbb{R}^m$, $\theta$ is the natural parameter, and $\eta = \partial \psi(\theta)/\partial \theta$ is the
expectation parameter with $\psi(\theta)$ a smooth (infinitely differentiable) convex function of
$\theta$. On the right-hand side of (2) and hereafter the Einstein summation convention will be
assumed, so that summation will be automatically taken over indices repeated twice in
the sense e.g. $\theta^i x_i = \sum_{i=1}^m \theta^i x_i$.
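To make (2) concrete, the following Python sketch checks numerically that the expectation parameter is the gradient of $\psi$ and that the Hessian of $\psi$ reproduces the Fisher information metric (the relation $g_{ij} = \partial_i \partial_j \psi$ used in Section 4). The choice $\psi(\theta) = \sum_i e^{\theta^i}$ (independent Poisson processes), the function names and the finite-difference step sizes are assumptions made only for this illustration.

```python
import numpy as np

def psi(theta):
    """Cumulant function of m independent Poisson processes (assumed example)."""
    return np.sum(np.exp(theta))

def log_density(x, t, theta):
    """log f(x, t, theta) = theta^i x_i - psi(theta) t, relative to the dominating measure."""
    # einsum spells out the Einstein summation over the repeated index i
    return np.einsum("i,i->", theta, x) - psi(theta) * t

def grad(f, theta, eps=1e-5):
    """Central finite-difference gradient of a scalar function f at theta."""
    g = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        g[i] = (f(theta + e) - f(theta - e)) / (2 * eps)
    return g

def hessian(f, theta, eps=1e-4):
    """Central finite-difference Hessian built from the gradient above."""
    m = theta.size
    H = np.zeros((m, m))
    for i in range(m):
        e = np.zeros(m)
        e[i] = eps
        H[:, i] = (grad(f, theta + e, eps) - grad(f, theta - e, eps)) / (2 * eps)
    return H

theta = np.array([0.2, -0.5, 1.0])
eta = grad(psi, theta)      # expectation parameter, here exp(theta)
g = hessian(psi, theta)     # Fisher metric, here diag(exp(theta))
```

For this particular $\psi$ the closed forms $\eta_i = e^{\theta^i}$ and $g_{ij} = e^{\theta^i} \delta_{ij}$ are available, so the finite-difference output can be compared against them directly.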

Sequential statistical procedures are characterized by a random sample size, where
stopping times are used to stop the observations of the process. We denote by $\tau$ an
arbitrary stopping time, i.e., a random variable $\tau$ defined on $\Omega$ with values in $T \cup \{\infty\}$
and possessing the property $\{\omega \in \Omega : \tau(\omega) \le t\} \in \mathcal{F}_t$, $\forall t \in T$. We consider the case to
which the Sudakov lemma applies, where the stopped process $(\tau, X(\tau))$ has the densities
$f(x, t, \theta)$, and these can be regarded as the same functions in (1) (cf. e.g. Magiera (1974)).

A typical statistical problem is the unbiased estimation of a given $m$-dimensional real
vector valued function $h(\theta) = (h_1(\theta), \ldots, h_m(\theta))$ of the parameter $\theta$ by using observations
on the $X(t)$, $t \in T$. An estimating procedure for $h(\theta)$ is defined by a pair $(\tau, Z(\tau, X(\tau)))$,
where $\tau$ is a stopping variable and $Z(\tau, X(\tau)) = (Z_1(\tau, X(\tau)), \ldots, Z_m(\tau, X(\tau)))$ is an
$\mathbb{R}^m$-valued function defined on $T \times E$, which is an unbiased estimator of $h(\theta)$, i.e.,

$$E_\theta[Z(\tau, X(\tau))] = h(\theta).$$

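Let us look at all estimation procedures satisfying the following regularity conditions:
(i) The $h(\theta)$ gives a smooth one-to-one transformation from $\Theta$ to $H = h(\Theta)$ in the sense

$$\operatorname{rank} C_{i\alpha}(\theta) = \operatorname{rank} \frac{\partial h_\alpha}{\partial \theta^i} = m, \quad \forall \theta \in \Theta.$$

(ii) $E_\theta[\tau] < \infty$, $|E_\theta[Z_\alpha(\tau, X(\tau)) Z_\beta(\tau, X(\tau))]| < \infty$, $\forall \theta \in \Theta$, $\alpha, \beta = 1, \ldots, m$, and the relation
$E_\theta[Z(\tau, X(\tau))] = h(\theta)$ can be differentiated with respect to $\theta$ under the expectation
sign.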

Then we have the so-called Cramer-Rao inequality for the covariance matrix of the 
unbiased estimators. 

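Proposition 2.1. If $Z(\tau, X(\tau))$ is an unbiased estimator of $h(\theta)$, then the covariance
matrix of $Z(\tau, X(\tau))$ is bounded below as

$$E_\theta[(Z(\tau, X(\tau)) - h(\theta))(Z(\tau, X(\tau)) - h(\theta))'] \ge G(h(\theta))^{-1}, \tag{3}$$

where

$$G(h(\theta)) = [\bar{g}^{\alpha\beta}(h(\theta))], \quad \bar{g}^{\alpha\beta}(h(\theta)) = E_\theta[\partial^\alpha \bar{l}\, \partial^\beta \bar{l}] = C^{\alpha i} C^{\beta j} \bar{g}_{ij}(\theta),$$

$$C^{\alpha i}(\theta) = \frac{\partial \theta^i}{\partial h_\alpha}, \quad \bar{g}_{ij}(\theta) = E_\theta[\partial_i \bar{l}\, \partial_j \bar{l}], \quad \bar{l} = \log f(X(\tau), \tau, h(\theta)), \quad \partial^\alpha = \frac{\partial}{\partial h_\alpha}, \quad \partial_i = \frac{\partial}{\partial \theta^i},$$

and for symmetric matrices $A$ and $B$, the inequality $A \ge B$ means that $A - B$ is positive
semi-definite. The equality in (3) holds if and only if $Z(\tau, X(\tau))$ can be represented almost
everywhere as

$$Z(\tau, X(\tau)) = h(\theta) + G(h(\theta))^{-1} \partial \bar{l}(X(\tau), \tau, h(\theta)), \quad \partial \bar{l} = (\partial^1 \bar{l}, \ldots, \partial^m \bar{l})'. \tag{4}$$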

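The condition (4) is also written as

$$\partial \bar{l}(X(\tau), \tau, h(\theta)) = G(h(\theta))(Z(\tau, X(\tau)) - h(\theta)),$$

or in component form

$$\partial^\alpha \bar{l}(X(\tau), \tau, h(\theta)) = \bar{g}^{\alpha\beta}(h(\theta)) Z_\beta(\tau, X(\tau)) - \bar{g}^{\alpha\beta}(h(\theta)) h_\beta(\theta) = k^\alpha(X(\tau), \tau, h(\theta)).$$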
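The above is a partial differential equation for $\bar{l}(X(\tau), \tau, h(\theta))$, of which the integrability
condition $\partial^\beta k^\alpha = \partial^\alpha k^\beta$ is

$$[\partial^\beta \bar{g}^{\alpha\gamma}(h(\theta)) - \partial^\alpha \bar{g}^{\beta\gamma}(h(\theta))](Z_\gamma(\tau, X(\tau)) - h_\gamma(\theta)) = 0,$$

and hence the requirement for integrability is

$$\partial^\beta \bar{g}^{\alpha\gamma}(h) = \partial^\alpha \bar{g}^{\beta\gamma}(h) \iff \exists\, \bar{\phi}(h) \text{ smooth and convex in } h \text{ such that } \bar{g}^{\alpha\beta}(h) = \partial^\alpha \partial^\beta \bar{\phi}(h).$$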

In this case, the log likelihood function $\bar{l}(x, t, h(\theta))$ is expressed as

$$\bar{l}(x, t, h(\theta)) = \bar{l}_0(x, t) + \xi^\alpha z_\alpha(x, t) - \bar{\psi}(\xi), \tag{5}$$

which implies that $\{f(x, t, \theta)\}$ must be a f.r.m. exponential family with $\xi$ and $h$ the natural
and the expectation parameters, respectively.

Suppose that the original {f{x,t,6)} is not a f.r.m. exponential family, then clearly 
{f{x,t,6)} is not a f.r.m. exponential one, either. Hence we can restrict attention to the 
case (2) when considering the attainment of the lower bound given by Proposition 2.1.
However we should remark that it is only a necessary condition for the attainment of the 
lower bound. In fact even in the f.r.m. exponential family, only some restricted cases can 
exactly attain the lower bound due to the problem of the "overshooting" at the efficient 
stopping times (see Ghosh (1987)). 



3 Conformal transformation of statistical manifold 

Let $M = \{f(x, 1, \theta) \mid \theta \in \Theta\}$ be an original family of probability densities of unit time,
where $\Theta$ is homeomorphic to $\mathbb{R}^m$. The family $M$ can be regarded as a statistical manifold,
where the $m$-dimensional vector parameter $\theta$ serves as a coordinate system to specify a
point, that is, a density $f(x, 1, \theta) \in M$. The geometry of $M$ is determined by the following
two tensor quantities (cf. Amari (1985), Amari and Nagaoka (2000))

$$g_{ij}(\theta) = E_\theta[\partial_i l_1 \partial_j l_1], \quad T_{ijk}(\theta) = E_\theta[\partial_i l_1 \partial_j l_1 \partial_k l_1], \quad l_1 = \log f(x, 1, \theta), \quad \partial_i = \frac{\partial}{\partial \theta^i}.$$

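The first is the Fisher information metric and the second is called the skewness tensor.
A one-parameter family of affine connections named the $\alpha$-connection is defined by

$$\Gamma^{(\alpha)}_{ijk}(\theta) = E_\theta[\partial_i \partial_j l_1 \partial_k l_1] + \frac{1-\alpha}{2} T_{ijk}(\theta),$$

and then the $\alpha$-Riemann-Christoffel curvature tensor is given by

$$R^{(\alpha)}_{ijkl}(\theta) = \partial_i \Gamma^{(\alpha)}_{jkl} - \partial_j \Gamma^{(\alpha)}_{ikl} + g^{rs}(\Gamma^{(\alpha)}_{ikr} \Gamma^{(\alpha)}_{jsl} - \Gamma^{(\alpha)}_{jkr} \Gamma^{(\alpha)}_{isl}).$$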
The $\alpha$- and the $(-\alpha)$-connections are mutually dual
and the $\pm\alpha$-RC curvature tensors are in the dual relation

$$R^{(\alpha)}_{ijkl} = -R^{(-\alpha)}_{ijlk}.$$

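Let $\bar{M} = \{f(x, t, \theta) \mid \theta \in \Theta\}$ be an $m$-dimensional extended statistical manifold under
a sequential statistical procedure. From the Wald identity the metric and the skewness
tensors of $\bar{M}$ are given by (see Akahira and Takeuchi (1989))

$$\bar{g}_{ij}(\theta) = \nu g_{ij}, \quad \bar{T}_{ijk}(\theta) = \nu[T_{ijk} + 3 g_{(ij} s_{k)}], \tag{6}$$
$$\nu(\theta) = E_\theta[\tau], \quad s_k(\theta) = \partial_k \log \nu(\theta), \quad 3 g_{(ij} s_{k)} = g_{ij} s_k + g_{jk} s_i + g_{ki} s_j. \tag{7}$$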

These relations show that a sequential statistical procedure induces a conformal transformation
$M \mapsto \bar{M}$ by the gauge function $\nu(\theta) > 0$. The conformal transformation of a
Riemannian manifold implies that the manifold is expanded or contracted isotropically
but that the expansion rate depends on each point. Our transformation is a statistical
counterpart of this one. A conformal transformation changes the $\alpha$-connection into



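$$\bar{\Gamma}^{(\alpha)}_{ijk} = \nu\left[\Gamma^{(\alpha)}_{ijk} + \frac{1-\alpha}{2}(g_{ki} s_j + g_{kj} s_i) - \frac{1+\alpha}{2} g_{ij} s_k\right], \tag{8}$$

$$\bar{\Gamma}^{(\alpha)k}_{ij} = \bar{\Gamma}^{(\alpha)}_{ijl} \bar{g}^{kl} = \Gamma^{(\alpha)k}_{ij} + \frac{1-\alpha}{2}(\delta^k_i s_j + \delta^k_j s_i) - \frac{1+\alpha}{2} g_{ij} s^k. \tag{9}$$

This is obtained by substituting (6) into

$$\bar{\Gamma}^{(\alpha)}_{ijk} = \bar{\Gamma}^{(0)}_{ijk} - \frac{\alpha}{2} \bar{T}_{ijk}$$

and by noting that $\bar{\Gamma}^{(0)}_{ijk}$ is the conformal change of the Riemannian connection $\Gamma^{(0)}_{ijk}$. Then
a conformal transformation changes the $\alpha$-RC curvature tensor into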

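$$\bar{R}^{(\alpha)}_{ijkl} = \nu[R^{(\alpha)}_{ijkl} - g_{il} s^{(\alpha)}_{jk} + g_{jl} s^{(\alpha)}_{ik} - g_{jk} s^{(-\alpha)}_{il} + g_{ik} s^{(-\alpha)}_{jl}], \tag{10}$$

$$\bar{R}^{(\alpha)\,l}_{ijk} = \bar{R}^{(\alpha)}_{ijkr} \bar{g}^{rl} = R^{(\alpha)\,l}_{ijk} - \delta^l_i s^{(\alpha)}_{jk} + \delta^l_j s^{(\alpha)}_{ik} - g_{jk} s^{(-\alpha)}_{ir} g^{rl} + g_{ik} s^{(-\alpha)}_{jr} g^{rl}, \tag{11}$$

$$s^{(\alpha)}_{ij} = \frac{1-\alpha}{2}\left[\nabla^{(\alpha)}_i s_j - \frac{1-\alpha}{2} s_i s_j + \frac{1+\alpha}{4} g_{ij} s_k s_l g^{kl}\right], \quad \nabla^{(\alpha)}_i s_j = \partial_i s_j - \Gamma^{(\alpha)k}_{ij} s_k. \tag{12}$$

This is obtained by substituting (8) into

$$\bar{R}^{(\alpha)}_{ijkl} = \partial_i \bar{\Gamma}^{(\alpha)}_{jkl} - \partial_j \bar{\Gamma}^{(\alpha)}_{ikl} + \bar{g}^{rs}(\bar{\Gamma}^{(\alpha)}_{ikr} \bar{\Gamma}^{(\alpha)}_{jsl} - \bar{\Gamma}^{(\alpha)}_{jkr} \bar{\Gamma}^{(\alpha)}_{isl}).$$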

We note that under a conformal transformation the mutual duality of $\pm\alpha$-connections is
preserved

$$\partial_i \bar{g}_{jk} = \bar{\Gamma}^{(\alpha)}_{ijk} + \bar{\Gamma}^{(-\alpha)}_{ikj}, \tag{13}$$






and also the dual relation of the $\pm\alpha$-RC curvature tensors is preserved

$$\bar{R}^{(\alpha)}_{ijkl} = -\bar{R}^{(-\alpha)}_{ijlk}. \tag{14}$$

These are confirmed by direct calculations with (6), (8) and (10).
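The scaling $\bar{g}_{ij} = \nu g_{ij}$ in (6), which is what makes the transformation conformal, can be checked exactly in a simple case. The Python sketch below uses inverse binomial sampling (stop at the $r$-th success of Bernoulli trials); this model, the function names and the truncation point `n_max` are assumptions chosen for illustration and are not taken from the text. It works in the success-probability coordinate $p$, where the unit-time information is $1/(p(1-p))$, and compares the information carried by the stopped sample with $E[\tau]$ times the unit-time information.

```python
import math

def unit_fisher_information(p):
    """Fisher information of one Bernoulli(p) observation, in the p-coordinate."""
    return 1.0 / (p * (1.0 - p))

def stopped_sample_summary(p, r, n_max=2000):
    """Exact E[tau] and Fisher information under inverse binomial sampling.

    tau is the trial at which the r-th success occurs, so
    P(tau = n) = C(n-1, r-1) p^r (1-p)^(n-r) for n >= r.
    The infinite sums are truncated at n_max (an assumed cutoff).
    """
    mean_tau = 0.0
    info = 0.0
    for n in range(r, n_max + 1):
        log_pmf = (math.lgamma(n) - math.lgamma(r) - math.lgamma(n - r + 1)
                   + r * math.log(p) + (n - r) * math.log(1.0 - p))
        pmf = math.exp(log_pmf)
        # score of the stopped sample: d/dp [r log p + (n - r) log(1 - p)]
        score = r / p - (n - r) / (1.0 - p)
        mean_tau += n * pmf
        info += score * score * pmf
    return mean_tau, info

mean_tau, info = stopped_sample_summary(0.3, 4)
ratio = info / unit_fisher_information(0.3)   # should agree with mean_tau
```

Here the gauge function is $\nu(p) = E_p[\tau] = r/p$, which varies with $p$, so the expansion of the metric is nonuniform over the manifold, as described above.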

One of the concerns about the conformal transformation is whether a given manifold 
can be transformed into a desirable space in some sense. The main objective from the 
geometrical viewpoint is the flatness or the straightness, and it has been investigated 
usually in terms of the Riemannian connection. From the statistical viewpoint, the main 
objective is the flatness or the straightness in terms of the mutually dual $\pm 1$-connections.
Thus we say that a statistical manifold $M$ is conformally mixture (exponential) flat when
there exists a gauge function $\nu(\theta) > 0$ such that $\bar{R}^{(-1)}_{ijkl} = 0$ ($\bar{R}^{(1)}_{ijkl} = 0$) holds. Note that
by (14) $M$ is conformally mixture flat if and only if $M$ is conformally exponential flat.

In view of these observations and also the work of Okamoto (1988), we introduce the 
set of (— l)-Weyl-Schouten curvature tensors as follows. 

Definition 3.1. 

w't"(«) = 4-"-ii<r", (17) 



From (11) and (12) we have 

^ 4r' = -;;7^(fl;;"-fi£") 

For the case m = 2 we can also directly check wj^^/}^'' = 0, and hence wj^^^^^ = wj^^f}\ Then 
we obtain the following result as to the conditions for the conformal mixture (exponential) 
flatness. 

Theorem 3.1. A statistical manifold $M$ is conformally mixture flat (or equivalently exponential
flat) if and only if

(i) $W^{(-1)\,l}_{ijk} = 0$ when $m = \dim M \ge 3$.

(ii) $W^{(-1)}_{ijk} = 0$ and $W^{(-1)}_{ij} = 0$ when $m = \dim M = 2$.






For the sake of simplicity we hereafter refer to a notion such as conformally mixture (or
equivalently exponential) flat as conformally m(e)-flat.



Proof. Consider the relation 

When Bfjk^^ — we note by the integrability condition that



3sk such that V^- ^^Sk - SjSk = 7-^^^^ 



1 

m — 1 

Vj- Sfe - Vj Sk - -R-ijk si 



We first prove the necessity. Suppose that $M$ is conformally m(e)-flat. Then from
R^Jk^^Rjl^^ = 0, when m > 3 wc have wt^^^^ = W^^~^^^ = 0. When m = 2 since 
Wj^fi}^^ = we have W-j'^^^ = W-jk^^ — 0, and since there exists a log gauge function 

s = log we have W^j~^^ = wj;^^^ = 0. 

We next prove the sufficiency when $m \ge 3$. From Bianchi's second identity (cf.
Schouten (1954), p. 147) 

and from wj^^f}^^ = we have 

v,'-"w,<;'>' = v,'-'>iil->' - -l^(vi-'fl£" - v5-"4-") = (m - ml^' = 0, 

so that Wj^~i}^ = 0. Then as noted before, there exists a covariant vector field s^. such that 
^^Sk — SjSk = From the Bianchi's first identity (cf. Schouten (1954), p. 144) 

and from the duality of R^f/^i 

p(l) , _ _ d(-1) 

-^ijkl ' ^\jkl ~ -^ijlk -^ijlk ' 

we have 
On the other hand, from W^j^ = we have

d(1) _ p(-l) _ _J_(n n R^'^h 

^ = RSk9^' = z;^i9uR^^'' - R^-'^ = B^g^'. 

By combining these two relations we obtain 

and hence there exists a log gauge function $s = \log \nu$ such that $s_k = \partial_k s$.

Finally we prove the sufficiency when m = 2. From wj^j/}^'' = 0, wj^j/}'' = 0, as noted 

above, there exists a covariant vector field Sk such that vl ^^Sk — SjSk = ^-t-RII"'^^- Then 

J Til, — J_ J ii' 

from 

= Rj'k'^ - RiT^ = d,sk - dkS, = 0, 

there exists a log gauge function $s = \log \nu$ such that $s_k = \partial_k s$.

This completes the proof of the theorem. □ 



We further investigate the implications of Theorem 3.1. Suppose that the $(-1)$-RC
curvature tensor of a statistical manifold $M$ is expressed as

$$R^{(-1)}_{ijkl} = \lambda(g_{jk} g_{il} - g_{ik} g_{jl}), \tag{19}$$

where $\lambda$ is constant on $M$. In this case we have

p(-i) _ _ d(i) _ p(-i) 

^ijkl — ^ijlk — ^ijkl — ^jilk ■ y^^J 

A statistical manifold $M$ satisfying (19) is said to be a space of constant mixture (exponential)
curvature, and $M$ satisfying (20) is said to be conjugate mixture (exponential)
symmetric. The conjugate m(e)-symmetry is a special case of the conjugate $\pm\alpha$-symmetry
introduced by Lauritzen (1987). By definition we know

$M$ is a space of constant m(e)-curvature $\Rightarrow$ $M$ is conjugate m(e)-symmetric.

The connections among these notions are summarized in the following theorem. 

Theorem 3.2. For the conformal m(e)- flatness of a statistical manifold M, the following 
relations hold. 

(i) $M$ is conjugate m(e)-symmetric and is conformally m(e)-flat if and only
if $M$ is a space of constant m(e)-curvature.

(ii) A f.r.m. exponential family $M_e$ is always conformally m(e)-flat.
Proof. We first prove the sufficiency of (i). Suppose that $M$ is a space of constant m(e)-curvature.
From

$$R^{(-1)}_{ijkl} = \lambda(g_{jk} g_{il} - g_{ik} g_{jl}),$$

we have



Then we obtain 



Ri'^ = -im- l)X9jk. 



= A(V;-'^^,, - V^r'^g.^k) = (since vf ^y.^ = aT,,^), 

W^r'^ = (m - l)X{gij - g,,) = 0. 

We next prove the necessity of (i). Suppose that M is conjugate m(e)-symmetric and 
is conformally m(e)-flat. When m > 3, we have 

R]iik = R\jki = ;;5;ir3;(5'''^ifc^^ - 9jiR\k^^), 
Rii -Rjiik9 -zrzr\\9ii^' ^ ^ii ) 

lib X 

^ - 

^ ^^^'^^ " m(m - 1) ~ ^'''^^'^ 



m(m — 1) 

By substituting this expression into the Bianchi's second identity, we have 

(since gjk — —Tijk is symmetric in i,j, k) 
(m - l)(m - 2)V|.~^V = (m - l)(m - 2)9^p = 0, 
so that p is constant on M. When m = 2 we have 



V7(-1) p(-l) _ V7(-1) p(-l) p(-l) _ ZZ n 

^jk - ^ik ' ^jk - -^—9jk 



=^ diR'^ ^^gjk = djR^ ^^git (since V> ^'gjk = -Tijfc) 
^ = 0, 

and again R^^^^ is constant on M. This completes the proof of (i). 

Since $M_e$ is a space of zero m(e)-curvature, $R^{(\pm 1)}_{ijkl} = 0$, (ii) is obtained from (i). □
Figure 1 illustrates the relations among several notions in Theorem 3.2.

[Figure 1: Relations among several notions on $M$ (diagram relating conjugate m(e)-symmetric, conformally m(e)-flat, and constant m(e)-curvature).]

4 Conformal geometry of exponential family 

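Based on Theorem 3.2 (ii), we seek a concrete conformal transformation $M_e \mapsto \bar{M}_e$ such
that $\bar{M}_e$ is $\pm 1$-flat. When $M_e$ is a f.r.m. exponential family, it is $\pm 1$-flat, i.e., $R^{(\pm 1)}_{ijkl} = 0$,
in which the natural parameter $\theta$ and the expectation parameter $\eta$ provide the $\pm 1$-affine
coordinate systems of $M_e$ in the sense (cf. Amari (1985), Amari and Nagaoka (2000))

$$\Gamma^{(1)}_{ijk}(\theta) = E_\theta[\partial_i \partial_j l_1 \partial_k l_1] = 0, \quad \partial_i = \frac{\partial}{\partial \theta^i},$$

$$\Gamma^{(-1)ijk}(\eta) = E_\eta[(\partial^i \partial^j l_1 + \partial^i l_1 \partial^j l_1) \partial^k l_1] = 0, \quad \partial^i = \frac{\partial}{\partial \eta_i},$$

and there exist two potential functions $\psi(\theta)$ and $\phi(\eta)$ such that

$$\theta^i = \partial^i \phi(\eta), \quad \eta_i = \partial_i \psi(\theta), \quad g_{ij}(\theta) = \partial_i \partial_j \psi(\theta), \quad g^{ij}(\eta) = \partial^i \partial^j \phi(\eta), \quad \psi(\theta) + \phi(\eta) - \theta^i \eta_i = 0.$$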

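By the formula (8) a conformal transformation $M_e \mapsto \bar{M}_e$ with gauge function $\nu(\eta) > 0$
changes $\Gamma^{(-1)ijk}(\eta)$ into

$$\bar{\Gamma}^{(-1)ijk}(\eta) = \nu[g^{ik} s^j + g^{jk} s^i], \quad g^{ij} = E_\eta[\partial^i l_1 \partial^j l_1], \quad s^i = \partial^i \log \nu.$$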

We consider a coordinate transformation from t] to h which will provide a (— l)-affine 
coordinate system of Mg- The (— l)-connection T^^^^^^''(r]) transforms to 

f (-i)a/37(/,) = B^B^BlT^~'^'^\i]) + g^'Bld'^B^, 5f = 9" = 

= z/5,^[i?r5f (/^s^- + g'^s') + g^'d'^B^], 
and hence



Me is m-flat ^ r(-^)"^^(/i) = 

^ 3h, V such that B'^B^Ag^'s^ + ^'^^s*) + ^^'^S^Ef = 



dhr 



^ 3h, V such that d'Cl - s'Ci - s^C'^ = 0, s^-^^'^ = d's^ - s's^ = 0, Q 



dT]i 



where $s^{(-1)ij} = 0$ is the integrability condition of the first equation on the right-hand side.
We can solve the above two partial differential equations for $s(\eta) = \log \nu(\eta)$ and $h_a(\eta)$ as
shown in the following theorem.

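Theorem 4.1. When a statistical manifold $M_e$ is an $m$-dimensional f.r.m. exponential
family with $(-1)$-affine coordinate system $\eta$, it is conformally m(e)-flat by the gauge
function $\nu(\eta) > 0$ and the new $(-1)$-affine coordinate system $h$ given by

$$\nu(\eta) = \frac{1}{c^0 + c^i \eta_i}, \quad h_a(\eta) = \frac{d_a + D^i_a \eta_i}{c^0 + c^i \eta_i}, \tag{21}$$

where $c^0, c^i, d_a, D^i_a$ $(i, a = 1, \ldots, m)$ are constants, and $\operatorname{rank} D^i_a = m$.

The new 1-affine coordinate system $\xi$, two potential functions $\bar{\psi}(\xi)$ and $\bar{\phi}(h)$ are respectively
given as

$$\xi^a = \partial^a \bar{\phi}(h), \quad h_a = \partial_a \bar{\psi}(\xi), \quad \bar{\phi}(h) = \nu(\eta) \phi(\eta), \quad \bar{\psi}(\xi) + \bar{\phi}(h) - \xi^a h_a = 0, \tag{22}$$
$$\bar{g}^{ab}(h) = \partial^a \partial^b \bar{\phi}(h), \quad \bar{g}_{ab}(\xi) = \partial_a \partial_b \bar{\psi}(\xi). \tag{23}$$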



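Proof. We first prove (21). By putting $s = -\log r$ we have

$$\partial^i s^j - s^i s^j = 0 \iff \partial^i \partial^j r = 0 \iff r = c^0 + c^i \eta_i \iff \nu = \frac{1}{r} = \frac{1}{c^0 + c^i \eta_i},$$

and by putting $h_a = e^s y_a$, with $C^i_a = \partial^i h_a$, we have

$$\partial^i C^j_a - s^i C^j_a - s^j C^i_a = 0 \iff \partial^i \partial^j y_a = 0 \iff y_a = d_a + D^i_a \eta_i \iff h_a = \frac{d_a + D^i_a \eta_i}{c^0 + c^i \eta_i}.$$

We next prove (22), (23). By direct calculation we can confirm

$$\bar{g}^{ab}(h) = \partial^a \partial^b \bar{\phi}(h), \quad \bar{\phi}(h) = \nu(\eta) \phi(\eta),$$

and then the others are immediately obtained.

This completes the proof of the theorem. □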

We remark that $\nu(\eta)$ and $h(\eta)$ in (21) cover the general solution and these are the
same as those given in Winkler and Franz (1979), which were derived from the statistical
considerations of the efficient sequential estimators attaining the Cramér-Rao bound.
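A one-dimensional instance of (21) can be checked exactly. The Python sketch below is an illustration under stated assumptions, not a construction from the text: it takes the Bernoulli process with expectation parameter $\eta = p$, stops at the $r$-th success so that $E[\tau] = r/p = K\nu(\eta)$ with $K = r$ and $\nu(\eta) = 1/\eta$ (that is, $c^0 = 0$, $c^1 = 1$), and estimates $h(\eta) = (1 - \eta)/\eta$ (that is, $d = 1$, $D = -1$) by $Z = (\tau - r)/r$. All function names and the truncation point `n_max` are hypothetical. The code compares the exact variance of $Z$ with the bound (3).

```python
import math

def gauge(eta, c0=0.0, c1=1.0):
    """nu(eta) = 1 / (c0 + c1 * eta), the m = 1 case of (21)."""
    return 1.0 / (c0 + c1 * eta)

def h_coordinate(eta, d=1.0, D=-1.0, c0=0.0, c1=1.0):
    """h(eta) = (d + D * eta) / (c0 + c1 * eta), the m = 1 case of (21)."""
    return (d + D * eta) / (c0 + c1 * eta)

def stopping_time_pmf(n, r, p):
    """P(tau = n), tau being the trial at which the r-th success occurs."""
    log_pmf = (math.lgamma(n) - math.lgamma(r) - math.lgamma(n - r + 1)
               + r * math.log(p) + (n - r) * math.log(1.0 - p))
    return math.exp(log_pmf)

def estimator_moments(p, r, n_max=2000):
    """Exact mean and variance of Z = (tau - r) / r (sums truncated at n_max)."""
    support = range(r, n_max + 1)
    probs = [stopping_time_pmf(n, r, p) for n in support]
    values = [(n - r) / r for n in support]
    mean = sum(z * q for z, q in zip(values, probs))
    var = sum((z - mean) ** 2 * q for z, q in zip(values, probs))
    return mean, var

def cramer_rao_bound(p, r):
    """Bound (3) for h(p) = (1 - p) / p under the stopped sample, with K = r."""
    unit_information = 1.0 / (p * (1.0 - p))                 # g in the p-coordinate
    stopped_information = r * gauge(p) * unit_information    # K * nu * g, cf. (6)
    dh_dp = -1.0 / p ** 2                                    # derivative of (1 - p) / p
    return dh_dp ** 2 / stopped_information

mean, var = estimator_moments(0.3, 4)
bound = cramer_rao_bound(0.3, 4)
```

In this example $Z$ is unbiased for $h$ and its variance coincides with the bound, which is consistent with $\nu$ and $h$ being of the fractional linear form (21).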
5 Conformal geometry of curved exponential family

We first introduce a curved exponential family. A family of probability densities $M_c =
\{f_c(x, t, u) \mid u \in U\}$ parameterized by an $m$-dimensional vector parameter $u = (u^1, \ldots, u^m)$
is said to be an $(n, m)$-curved exponential family when it is smoothly imbedded in an $n$-dimensional
f.r.m. exponential family $M_e = \{f_e(x, t, \theta) \mid \theta \in \Theta\}$ in the sense

$$f_c(x, t, u) = f_e(x, t, \theta(u)) = \exp\{\theta^i(u) x_i - \psi(\theta(u)) t\}, \tag{24}$$

where $U$ is homeomorphic to $\mathbb{R}^m$ $(m < n)$ and $\theta(u) = (\theta^1(u), \ldots, \theta^n(u))'$ is a smooth
function of $u$ having a full rank Jacobian matrix. We use indices $i, j, k$ and so on to
denote quantities in terms of the coordinate system $\theta$ or $\eta$ of $M_e$, and indices $a, b, c$ and
so on to denote quantities in terms of the coordinate system $u$ of $M_c$.

For analyzing the geometrical properties of $M_c$ imbedded in $M_e$, it is convenient to
introduce a new coordinate system $w = (u, v)$ of $M_e$ in the following manner. We attach
to each point $u \in M_c$ an $(n - m)$-dimensional smooth submanifold $A(u)$ of $M_e$ which
transverses $M_c$ at $\theta(u)$ or equivalently at $\eta(u)$. We assume that the family $\mathcal{A} = \{A(u) \mid u \in
M_c\}$ fills up at least a neighborhood of $M_c$ in $M_e$, that is, $\mathcal{A}$ is a foliation of the tubular
neighborhood of $M_c$ in $M_e$. Such an $A(u)$ is called an ancillary submanifold rigging $u$,
and $\mathcal{A}$ is called an ancillary family rigging $M_c$.

We introduce a coordinate system $v = (v^{m+1}, \ldots, v^n)$ to each $A(u)$ such that the origin
$v = 0$ is at the intersection of $A(u)$ and $M_c$. Then the combined system

$$w = (w^\alpha) = (u^a, v^\kappa), \quad \alpha = 1, \ldots, n, \quad a = 1, \ldots, m, \quad \kappa = m + 1, \ldots, n$$

gives a new local coordinate system of $M_e$. We use indices $\alpha, \beta, \gamma$ and so on for quantities
related to the coordinate system $w$, and indices $\kappa, \lambda, \mu$ and so on for quantities related to
the coordinate system $v$.

The basic tensors of Mg are written as 

and the a-connection is given by 

= r("'^'^''{r))Bp,B^jBsk + g'^{v)BsidpB^j = -^7>75 + {dpB^3)Bl 

R _ % Ri 

in the $w$-coordinate system. When we evaluate a quantity $q(u, v)$ on $M_c$, i.e., at $v = 0$,
we often denote it by $q(u)$ instead of by $q(u, 0)$ for brevity's sake. The metric tensors of
$M_c$ and $A(u)$ are given by

$$g_{ab}(u) = B^i_a B^j_b g_{ij} = B_{ai} B_{bj} g^{ij}, \quad g_{\kappa\lambda}(u) = B^i_\kappa B^j_\lambda g_{ij} = B_{\kappa i} B_{\lambda j} g^{ij},$$
and then indices can be lowered or raised by using these metric tensors or their inverses
$g^{ab}(u)$, $g^{\kappa\lambda}(u)$. The $\pm 1$-connections of $M_c$ are given by

$$\Gamma^{(1)}_{abc}(u) = (\partial_a B^i_b) B_{ci}, \quad \Gamma^{(-1)}_{abc}(u) = (\partial_a B_{bi}) B^i_c.$$

We call $\mathcal{A} = \{A(u) \mid u \in U\}$ an orthogonal ancillary family when $g_{a\kappa}(u) = 0$, $\forall u \in U$,
and we assume this property in the following. The mixed parts $\Gamma^{(\pm 1)}_{ab\kappa}(u)$ play central roles
in the evaluation of statistical inferences, which are defined as follows.

Definition 5.1. 

and we call $H^{(\pm 1)}_{ab\kappa}$ the $\pm 1$-Euler-Schouten curvature tensors of $M_c$.

The $\pm 1$-RC curvature tensors and the $\pm 1$-ES curvature tensors of $M_c$ are connected by
the equations of Gauss (cf. Schouten (1954), p. 266)

^abcdW - ^ijkl ^a^b^c^d + K^adK ^bc\ " ^bdfi ^ac\ >9 

- {^adK ^bc\ - ^bdK ^acX )9 ■ l^OJ 

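Suppose that the $\pm 1$-ES curvature tensors of $M_c$ are related as

$$H^{(-1)}_{ab\kappa}(u) = \varepsilon H^{(1)}_{ab\kappa}(u), \tag{26}$$

where $\varepsilon$ $(\ne 0)$ is a constant. In this case we have

$$R^{(1)}_{abcd}(u) = R^{(-1)}_{abcd}(u) = \varepsilon (H^{(1)}_{ad\kappa} H^{(1)}_{bc\lambda} - H^{(1)}_{bd\kappa} H^{(1)}_{ac\lambda}) g^{\kappa\lambda},$$

so that $M_c$ is conjugate m(e)-symmetric. Thus we say that $M_c$ satisfying (26) is ES
conjugate m(e)-symmetric. Suppose further that the 1-ES curvature tensor of $M_c$ is
written as

$$H^{(1)}_{ab\kappa}(u) = H^{(1)}_\kappa g_{ab}(u), \quad H^{(1)}_\kappa(u) = \frac{1}{m} H^{(1)}_{ab\kappa} g^{ab}, \quad u \in M_c, \tag{27}$$

where $H^{(1)}_\kappa$ is called the mean 1-ES curvature of $M_c$, and $M_c$ satisfying (27) is said to be
totally exponential umbilic (e-umbilic).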

The implications of these notions are summarized in the following manner. 

Theorem 5.1. For an $(n, m)$-curved exponential family $M_c$, the following relation holds.

Let $m \ge 3$ or $n = m + 1$, and suppose that $M_c$ is ES conjugate m(e)-symmetric
and totally e-umbilic. Then $M_c$ is a space of constant m(e)-curvature.
Proof. Suppose that $M_c$ is ES conjugate m(e)-symmetric and totally e-umbilic. Then
from (26) and (27) we have

$$R^{(-1)}_{abcd}(u) = R^{(1)}_{abcd}(u) = \varepsilon H^{(1)2}(g_{ad} g_{bc} - g_{ac} g_{bd}), \quad H^{(1)2}(u) = H^{(1)}_\kappa H^{(1)}_\lambda g^{\kappa\lambda}.$$

When $m \ge 3$, as noted in the proof of Theorem 3.2, $\varepsilon H^{(1)2}$ is constant on $M_c$. When
$n = m + 1$, from the equation of Codazzi (cf. Schouten (1954), p. 266)

= R^BIBIB^B^ = VWi^if - V^^Hil^-, Bt = gH9^'B\ 
e(V«if(i)'')(76c - e{V^f^^H'-'^^)gac = (since V^^^g^c = Tabc) 

e(m - l)Vi^)i/(^)'" = e(m - l)daH^^'^^ = (since H^^'^^ is a scalar on k = m + 1), 

and without loss of generality we can set $g_{\kappa\kappa}(u) = 1$, so that $\varepsilon H^{(1)2} = \varepsilon H^{(1)}_\kappa H^{(1)}_\kappa g^{\kappa\kappa}$ is
again constant on $M_c$. □

We further deal with the case of $n = m + 1$. Suppose that $M_c$ satisfies the equations

$$B^i_\kappa(u) = k_0(\theta^i(u) - \theta^i_0), \quad B_{\kappa i}(u) = l_0(\eta_i(u) - \eta^0_i), \quad g_{\kappa\kappa}(u) = 1, \quad \kappa = m + 1, \tag{28}$$

where $k_0, l_0$ are non-zero constants and $\theta^i_0, \eta^0_i$ are constant vectors. In this case $M_c$ is
expressed as

$$(\theta^i(u) - \theta^i_0)(\eta_i(u) - \eta^0_i) = \frac{1}{k_0 l_0},$$

and we call $M_c$ satisfying (28) a dual quadric hypersurface. In Section 7 it will be shown
that the von Mises-Fisher model and the hyperboloid model are examples of the dual
quadric hypersurface. The meaning of this hypersurface is described in the following
theorem.

Theorem 5.2. For an $(m + 1, m)$-curved exponential family $M_c$, the following two conditions
are equivalent.

(i) $M_c$ is a dual quadric hypersurface.

(ii) $M_c$ is ES conjugate m(e)-symmetric and totally e-umbilic with constant
m(e)-curvature $k_0 l_0$, and $T_{a\kappa\kappa}(u) = 0$ on $M_c$.



Proof. We first prove (i) $\Rightarrow$ (ii). By the definition (28) we have

daBliu) = rW^i?M + rW'^i?., = koBliu), daB^,iu) = r^^)^!?,, + Ti-'^'B,, = loBc^^iu) 
^ T'^ai{n) = kogai>{u), r(l(w) = 0, T^;^^{u)=logab{u), r(;i)(w) = 0. 
On the other hand
and hence 

^L'H^) = -h9ab{u) = ^H^liu), H^^^^iu) = -logabiu), T_(n) = T[J - = 

=^ = kolo{9ad9bc - 9ac9bd)- 

We next prove (ii) $\Rightarrow$ (i). By the definitions (26) and (27) we have

^ItdV) = ^H^^^'^{gad9bc - 9ac9bd) 

^ ei/(i)2 = eHi'^Hi'^g-^ = hk, 9^» = 1 

^ = Hi'^gM = -VW^\9abiu), h[,1\u) = -y^W^\gab{u). 

On the other hand

= dagU^) = ril(w) + ri;i)(«), r,«,(«) = r^^) - rii = o, 

and hence 

rii(w) = o, rLP(w) = o 

^ da{B'M - ^\eKk\0\u)) = 0, da{B^,{u) - v/|Mo/e|r7.(n)) = 
This completes the proof of the theorem. □ 



Figure 2 illustrates the relations among several notions in Theorems 5.1 and 5.2. 
From Theorems 3.2 and 5.2, the dual quadric hypersurface is conformally m(e)-flat,
and we obtain the following result as to its dual structure. 

Theorem 5.3. When $M_c$ is an $m$-dimensional dual quadric hypersurface, it is conformally
m(e)-flat by the log gauge function $s(u) = \log \nu(u)$ and the $(-1)$-affine coordinate system
$\bar{u}^a$, $a = 1, \ldots, m$, satisfying

$$\partial_a s_b - \Gamma^{(-1)c}_{ab} s_c - s_a s_b = k_0 l_0 g_{ab}, \quad \bar{u}^a = \nu(u) D^{ai}(\eta_i(u) - \eta^0_i), \tag{29}$$

where $\eta^0_i$, $D^{ai}$ $(a = 1, \ldots, m, \ i = 1, \ldots, m + 1)$ are constants and $\operatorname{rank} D^{ai} = m$.
[Figure 2: Relations among several notions on $M_c$ (diagram relating ES conjugate m(e)-symmetric, totally e-umbilic, dual quadric hypersurface, and constant m(e)-curvature).]



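The 1-affine coordinate system $\bar{v}_a$, two potential functions $\bar{\psi}(\bar{v})$ and $\bar{\phi}(\bar{u})$ of $\bar{M}_c$ are
respectively given as

$$\bar{v}_a = \bar{\partial}_a \bar{\phi}(\bar{u}), \quad \bar{u}^a = \bar{\partial}^a \bar{\psi}(\bar{v}), \quad \bar{\phi}(\bar{u}) = \frac{\nu(u)}{k_0 l_0}, \quad \bar{\psi}(\bar{v}) + \bar{\phi}(\bar{u}) - \bar{v}_a \bar{u}^a = 0, \tag{30}$$
$$\bar{g}^{ab}(\bar{v}) = \bar{\partial}^a \bar{\partial}^b \bar{\psi}(\bar{v}), \quad \bar{g}_{ab}(\bar{u}) = \bar{\partial}_a \bar{\partial}_b \bar{\phi}(\bar{u}). \tag{31}$$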



Proof. We first prove (29). As noted in the proof of Theorem 3.1, the partial differential
equation for $s(u) = \log \nu(u)$ is

$$\partial_a s_b - \Gamma^{(-1)c}_{ab} s_c - s_a s_b = \frac{1}{m - 1} R^{(-1)}_{ab}.$$

When $M_c$ is an $m$-dimensional dual quadric hypersurface, from

$$R^{(-1)}_{abcd} = k_0 l_0 (g_{ad} g_{bc} - g_{ac} g_{bd}) \implies R^{(-1)}_{bc} = R^{(-1)}_{abcd} g^{ad} = k_0 l_0 (m - 1) g_{bc},$$

we have the first relation of (29).

The partial differential equation for u = u{u) is given as 

= BlBlBlf^^^iu) + -g,cBld-aBl = 
^ BIlB^.B^rii;^ + g,ABl] = 0, f i'^^ = z.[ri,,^) + g^cSb + gbcSa] 

When $M_c$ is a dual quadric hypersurface, we can directly show that the above is satisfied
by the second relation of (29).

We next prove (30), (31). By direct calculation we can confirm

$$\bar{g}_{ab}(\bar{u}) = \bar{\partial}_a \bar{\partial}_b \bar{\phi}(\bar{u}), \quad \bar{\phi}(\bar{u}) = \frac{\nu(u)}{k_0 l_0},$$

and then the others are immediately obtained.

This completes the proof of the theorem. □



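We next consider a conformal transformation $M_e \mapsto \bar{M}_e$ by the gauge function $\nu(w) > 0$.
As shown by (8), the $\alpha$-connection in terms of the $w$-coordinate system is changed into

$$\bar{\Gamma}^{(\alpha)}_{\beta\gamma\delta} = \nu\left[\Gamma^{(\alpha)}_{\beta\gamma\delta} + \frac{1-\alpha}{2}(g_{\delta\beta} s_\gamma + g_{\delta\gamma} s_\beta) - \frac{1+\alpha}{2} g_{\beta\gamma} s_\delta\right].$$

Then we can express the change of quantities related to $M_c$, that is, the $(-1)$-connection
of $M_c$, the 1-ES curvature of $M_c$ and the $(-1)$-ES curvature of $A(u)$ are respectively
changed into

$$\bar{\Gamma}^{(-1)}_{abc} = \nu[\Gamma^{(-1)}_{abc} + g_{ca} s_b + g_{cb} s_a], \quad \bar{H}^{(1)}_{ab\kappa} = \nu[H^{(1)}_{ab\kappa} - g_{ab} s_\kappa], \quad \bar{H}^{(-1)}_{\kappa\lambda a} = \nu H^{(-1)}_{\kappa\lambda a}. \tag{32}$$

It is also seen that

$$\bar{K}^{(1)}_{ab\kappa} = \nu K^{(1)}_{ab\kappa}, \quad K^{(1)}_{ab\kappa}(u) = H^{(1)}_{ab\kappa} - g_{ab} H^{(1)}_\kappa, \tag{33}$$

and we call $K^{(1)}_{ab\kappa}$ the conformal 1-ES curvature tensor.
Note that the change of the $(-1)$-connection

$$\bar{\Gamma}^{(-1)c}_{ab} = \Gamma^{(-1)c}_{ab} + \delta^c_a s_b + \delta^c_b s_a$$

induces the projective transformation at the same time, which implies that the mixture
geodesic is preserved under the transformation (cf. Schouten (1954), p. 287). The effect of
constant m(e)-curvature is given in the following theorem.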

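Theorem 5.4. Suppose that a curved exponential family $M_c$ is a space of constant m(e)-curvature.
Then there exists a conformal transformation $M_e \mapsto \bar{M}_e$ and a coordinate
system $\bar{u} = (\bar{u}^a)$, $a = 1, \ldots, m$, of $\bar{M}_c$ such that the following hold.

(i) $\bar{\Gamma}^{(-1)}_{abc}(\bar{u}) = 0$, $\forall \bar{u} \in \bar{M}_c$.

(ii) If $M_c$ is totally e-umbilic, then $\bar{H}^{(1)}_{ab\kappa}(\bar{u}) = 0$, $\forall \bar{u} \in \bar{M}_c$.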



Proof. We first prove (i). When Mc is a space of constant m(e)-curvature, from Theorem 
3.2, Mc is conformally m(e)-flat, so that we have 

r[1^}{u) = ^ 3u{u) >0, u = (u^) such that r^^_^^(M) = 0, Vm G Mc- 

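We next prove (ii). Let us take $s_\kappa(u) = H^{(1)}_\kappa(u)$ (see Okamoto, Amari and Takeuchi (1991)). Then for the totally e-umbilic $M_c$, from (32), (33) and (22) we have

\[ \tilde{H}^{(1)}_{ab\kappa}(u) = \tilde{K}^{(1)}_{ab\kappa}(u) = \nu K^{(1)}_{ab\kappa}(u) = 0, \ \forall u \in M_c \ \Rightarrow \ \tilde{H}^{(1)}_{M_c}(u) = 0, \ \forall u \in M_c. \]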

□

6 Sequential estimation in curved exponential family

We consider sequential estimations in an (n, m)-curved exponential family $M_c$. Let $K > 0$ be a parameter controlling the average sample size, and let $\nu(\eta) > 0$ ($\nu(w) > 0$) be a smooth gauge function defined on $M_e$ in the $\eta$-($w$-)coordinate system.

We denote by $\bar{X}_t = X(t)/t$ the sample mean up to time $t$. It has the same value in the $\eta$-coordinate system, and its value in the $w$-coordinate system is denoted by $\hat{w}_t = (\hat{u}_t, \hat{v}_t) = \eta^{-1}(\bar{X}_t)$. The random stopping time $\tau$ is assumed to satisfy (see Okamoto, Amari and Takeuchi (1991))

\[ \varepsilon = O_p(1), \quad E_u[\varepsilon] = O(1), \quad E_u[\tau] = K\nu(u), \quad K\nu(u) = O(K). \]

The term $c$ is due to the bias of $\hat{w}_\tau$ from the true $w = (u, 0)$, which is obtained by the requirement $E_u[\tau] = K\nu(u)$. The term $\varepsilon$ includes a rounding error and the "overshooting" at the stopping time $\tau$.

We cite the established results concerning the asymptotics of sequential estimators of 
u from Okamoto, Amari and Takeuchi (1991). 

Proposition 6.1. For a consistent sequential estimator $\hat{u}$ of $u$, the following relations hold.

(i) The estimator $\hat{u}$ is first-order efficient, that is, $\sqrt{K\nu}\,(\hat{u} - u) \to N(0, g^{ab}(u))$ as $K \to \infty$, if and only if $A = \{A(u)\}$ is an orthogonal ancillary family.

(ii) The bias-corrected estimator $\hat{u}^*$ of $u$ is given by

~ 2KU ^ ^ ~ +(>aSp + 0f3Sa. (34) 

(iii) The asymptotic covariance of $\hat{u}^*$ is given by

(35) 

where 

/'p(-l)N2a6 _ /^(-l)^2 ac bd f'^(-l)^2 _ 'p(-l) 'y^-^) a<^e 'p(-l) _ p(-l) , , . „ 

V^Mc J — l^Mc )cd9 9 ) y'-Mc Jab— cda efb 9 9 ) abc — abc ^ 9caSb ^ 9cbSa, 

\-"-Mc) ~ K-"-Mc)cd9 9 ■, \-"-Mc)ab — ^acK ^bd\9 9 i ^abK ~ ""ab/t 9ab^K, 

(TT{-'^)V2ab _ (tt{-1)V2 ac bd ('^7-(-l)^2 _ tt{-1) tt{-1) Kfj, \u 
) — K^A )cd9 9 ) [J^A Jab — -"kAo ^ fiub 9 9 ■ 

Based on Theorem 5.4 we obtain the following result for the possibility of covariance minimization.

Theorem 6.1. Suppose that a curved exponential family $M_c$ is a space of constant m(e)-curvature and is totally e-umbilic. Then there exist a conformal transformation $M_e \mapsto \tilde{M}_e$ and a coordinate system $\tilde{u} = (\tilde{u}^a)$, $a = 1, \ldots, m$, of $M_c$ such that the following holds for the maximum likelihood estimator $\hat{\tilde{u}}^a_{mle}$ of $\tilde{u}^a$ without bias-correction:

m^iCie - ^')(SL - ^')] = + 0{K-'). (36) 

When $M_c$ itself is a f.r.m. exponential family, (36) holds by (21) given in Theorem 4.1. When $M_c$ is a dual quadric hypersurface, (36) holds by (29) given in Theorem 5.3.



Proof. Since $H^{(-1)}_A = 0$ holds for the maximum likelihood estimator (m.l.e.), from (34) and Theorem 5.4, we have for the bias of the m.l.e.

bmle - ~ >-al3 9 - ' be 9 - -^kA 9 - - ^ Ic 9 " ^ n\ 9 " U, 

and the expression (36) is derived.

When $M_c$ itself is a f.r.m. exponential family with expectation parameter $u$, the partial differential equations for $s(u) = \log \nu(u)$ and for $\tilde{u} = \tilde{u}(u)$ are as noted in Theorem 4.1. When $M_c$ is a dual quadric hypersurface, the partial differential equations for $s(u) = \log \nu(u)$ and for $\tilde{u} = \tilde{u}(u)$ are as noted in Theorem 5.3.

This completes the proof of the theorem. □ 



7 Examples 

7.1 von Mises-Fisher model

This is an (m + 1, m)-curved exponential family, of which density functions with respect to the invariant measure on the m-dimensional unit sphere under rotational transformations are given by (cf. Barndorff-Nielsen et al. (1989), p. 76)

\[
\begin{aligned}
f_c(x, 1, u) &= \exp\{\theta(u) \cdot x - \psi(\theta(u))\}, \quad \theta \cdot x = \theta^1 x_1 + \theta^2 x_2 + \cdots + \theta^{m+1} x_{m+1}, \\
\theta &= r\xi = (r\xi^i), \quad \xi \in S^m = \{\xi \in \mathbb{R}^{m+1} \mid \xi \cdot \xi = 1\}, \quad x = (x_i) \in S^m, \\
\psi(\theta) &= -\log a_m(r), \quad 1/a_m(r) = (2\pi)^{(m+1)/2}\, r^{(1-m)/2}\, I_{(m-1)/2}(r), \quad r > 0,
\end{aligned}
\]

where $I_{(m-1)/2}(r)$ is the modified Bessel function of the first kind and of order $(m - 1)/2$. We assume that the concentration parameter $r$ is a given positive constant. The parametric representations $\theta = \theta(u)$ and $\eta = \eta(u)$ are given by

\[
\begin{aligned}
\theta^1(u) &= r \cos u^1, & \eta_1(u) &= r^\eta \cos u^1, \\
\theta^2(u) &= r \sin u^1 \cos u^2, & \eta_2(u) &= r^\eta \sin u^1 \cos u^2, \\
\theta^3(u) &= r \sin u^1 \sin u^2 \cos u^3, & \eta_3(u) &= r^\eta \sin u^1 \sin u^2 \cos u^3, \\
&\ \ \vdots & &\ \ \vdots \\
\theta^{m+1}(u) &= r \sin u^1 \sin u^2 \cdots \sin u^{m-1} \sin u^m, & \eta_{m+1}(u) &= r^\eta \sin u^1 \sin u^2 \cdots \sin u^{m-1} \sin u^m,
\end{aligned}
\]

where $0 < u^1, \ldots, u^{m-1} < \pi$, $0 < u^m < 2\pi$ and $r^\eta = -d \log a_m(r)/dr = I_{(m+1)/2}(r)/I_{(m-1)/2}(r)$. Note that $E[x] = r^\eta \xi$, and $r^\eta$ is a strictly increasing function of $r$ which maps $(0, \infty)$ onto $(0, 1)$.
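The parametric representations above can be evaluated directly. The following Python sketch computes $r^\eta$ as the ratio of modified Bessel functions, using a truncated power series for $I_\nu$, and returns $\theta(u)$ and $\eta(u)$ in the spherical coordinates. The function names and the series truncation are choices made for this illustration.

```python
import math


def bessel_i(order, x, terms=60):
    """Modified Bessel function of the first kind, truncated power series."""
    return sum((x / 2.0) ** (2 * k + order)
               / (math.factorial(k) * math.gamma(k + order + 1.0))
               for k in range(terms))


def r_eta(r, m):
    """r^eta = I_{(m+1)/2}(r) / I_{(m-1)/2}(r)."""
    return bessel_i((m + 1) / 2.0, r) / bessel_i((m - 1) / 2.0, r)


def unit_vector(u):
    """xi(u) = (cos u1, sin u1 cos u2, ..., sin u1 ... sin um)."""
    xi, prod = [], 1.0
    for angle in u:
        xi.append(prod * math.cos(angle))
        prod *= math.sin(angle)
    xi.append(prod)
    return xi


def theta_eta(u, r):
    """Natural parameter theta = r xi and expectation parameter eta = r^eta xi."""
    xi = unit_vector(u)
    scale = r_eta(r, len(u))
    return [r * c for c in xi], [scale * c for c in xi]


# Example with m = 2, r = 0.25 and u = (pi/6, pi/3).
theta, eta = theta_eta([math.pi / 6, math.pi / 3], 0.25)
```

For m = 2 the ratio has the closed form $\coth r - 1/r$, which gives a convenient check of the series.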

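From these representations the tangent vectors $B^i_a(u)$ and $B_{ai}(u)$ can be calculated, and then the unit normal vectors $B^i_\kappa(u)$ and $B_{\kappa i}(u)$ ($\kappa = m + 1$) are derived from the relations $B^i_\kappa(u) B_{ai}(u) = 0$ and $B_{\kappa i}(u) B^i_a(u) = 0$ as follows.

\[
\begin{aligned}
B^1_\kappa(u) &= \cos u^1, & B_{\kappa 1}(u) &= \cos u^1, \\
B^2_\kappa(u) &= \sin u^1 \cos u^2, & B_{\kappa 2}(u) &= \sin u^1 \cos u^2, \\
B^3_\kappa(u) &= \sin u^1 \sin u^2 \cos u^3, & B_{\kappa 3}(u) &= \sin u^1 \sin u^2 \cos u^3, \\
&\ \ \vdots & &\ \ \vdots \\
B^{m+1}_\kappa(u) &= \sin u^1 \sin u^2 \cdots \sin u^{m-1} \sin u^m, & B_{\kappa\, m+1}(u) &= \sin u^1 \sin u^2 \cdots \sin u^{m-1} \sin u^m.
\end{aligned}
\]

The above expressions show that

\[ B^i_\kappa(u) = \frac{1}{r}\,\theta^i(u), \qquad B_{\kappa i}(u) = \frac{1}{r^\eta}\,\eta_i(u), \]

and so this model is a dual quadric hypersurface. From Theorem 5.2 we also see that this model is ES conjugate m(e)-symmetric and totally e-umbilic with constant m(e)-curvature $1/(r r^\eta) > 0$. The related geometrical quantities are given below.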



9ab[U) 



sin^ u° 



^abba ^ " 



c=l 



■^9ab[Uj, 

f I 

1 



1, 



■-gab{u) 

r 



(u) = -^gaa{u)gbb{u), a^b, = .^ 9ab{u). 
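The metric of this model can be cross-checked numerically from the parametric representations. The following Python sketch assumes the standard expression $g_{ab} = \sum_i \partial_a \theta^i \, \partial_b \eta_i$ with $\theta = r\xi$ and $\eta = r^\eta \xi$, approximates the derivatives by central differences, and compares the result with the diagonal form $r r^\eta \prod_{c<a} \sin^2 u^c$ obtained by direct calculation. The function names and the test point are hypothetical.

```python
import math


def unit_vector(u):
    """xi(u) = (cos u1, sin u1 cos u2, ..., sin u1 ... sin um) on the unit sphere."""
    xi, prod = [], 1.0
    for angle in u:
        xi.append(prod * math.cos(angle))
        prod *= math.sin(angle)
    xi.append(prod)
    return xi


def tangent(u, a, h=1e-6):
    """Central-difference approximation of d xi / d u^a."""
    up, dn = list(u), list(u)
    up[a] += h
    dn[a] -= h
    return [(p - q) / (2.0 * h) for p, q in zip(unit_vector(up), unit_vector(dn))]


def metric(u, scale):
    """g_ab = scale * <d_a xi, d_b xi>, where scale = r * r^eta."""
    m = len(u)
    t = [tangent(u, a) for a in range(m)]
    return [[scale * sum(x * y for x, y in zip(t[a], t[b])) for b in range(m)]
            for a in range(m)]


def closed_form_diagonal(u, scale):
    """scale * prod_{c < a} sin^2 u^c for a = 1, ..., m."""
    diag, prod = [], 1.0
    for angle in u:
        diag.append(scale * prod)
        prod *= math.sin(angle) ** 2
    return diag


# Hypothetical test point with m = 3 and r * r^eta = 0.3.
u_test = [0.7, 1.1, 2.0]
g_numeric = metric(u_test, 0.3)
g_diag = closed_form_diagonal(u_test, 0.3)
```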



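From Theorem 3.2 (i) this $M_c$ is conformally m(e)-flat, so that there exist a gauge function $\nu(u) > 0$ and a (-1)-affine coordinate system $\tilde{u} = (\tilde{u}^a)$ such that $\tilde{\Gamma}^{(-1)}_{abc}(\tilde{u}) = 0$, $\forall u \in M_c$. As given by (29), the partial differential equation for $s(u) = \log \nu(u)$ is

\[ \partial_a s_b - \Gamma^{(-1)c}_{ab} s_c - s_a s_b = \frac{1}{r r^\eta}\, g_{ab}, \]

of which one solution is

\[ \nu(u) = \frac{1}{\prod_{a=1}^{m} |\sin u^a|}, \]

and then $\tilde{u}^a = \nu(u) D^{ai} \eta_i(u)$.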
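The gauge function can be checked numerically in the case m = 2, where $\nu(u) = 1/(|\sin u^1||\sin u^2|)$. The following Python sketch evaluates the left-hand side $\partial_a s_b - \Gamma^{(-1)c}_{ab} s_c - s_a s_b$ with $s = \log \nu$ by nested central differences. It assumes that the (-1)-connection of this model is $\Gamma^{(-1)}_{abc} = \partial_a \partial_b \eta_i \, \partial_c \theta^i$, which here coincides with the Christoffel symbols of the metric $r r^\eta\,\mathrm{diag}(1, \sin^2 u^1)$; that identification and all function names are assumptions of the sketch.

```python
import math


def log_gauge(u):
    """s(u) = log nu(u) with nu(u) = 1 / (|sin u1| |sin u2|)."""
    return -math.log(abs(math.sin(u[0]))) - math.log(abs(math.sin(u[1])))


def christoffel(u):
    """G[c][a][b] = Gamma^c_ab for the metric const * diag(1, sin^2 u1); the constant cancels."""
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[0][1][1] = -math.sin(u[0]) * math.cos(u[0])
    G[1][0][1] = G[1][1][0] = math.cos(u[0]) / math.sin(u[0])
    return G


def grad(f, u, h=1e-5):
    """Central-difference gradient of a scalar function."""
    out = []
    for a in range(len(u)):
        up, dn = list(u), list(u)
        up[a] += h
        dn[a] -= h
        out.append((f(up) - f(dn)) / (2.0 * h))
    return out


def pde_lhs(u, h=1e-4):
    """d_a s_b - Gamma^c_ab s_c - s_a s_b with s_a = d_a log nu."""
    s = grad(log_gauge, u)
    G = christoffel(u)
    lhs = [[0.0, 0.0], [0.0, 0.0]]
    for a in range(2):
        up, dn = list(u), list(u)
        up[a] += h
        dn[a] -= h
        ds = [(p - q) / (2.0 * h)
              for p, q in zip(grad(log_gauge, up), grad(log_gauge, dn))]
        for b in range(2):
            lhs[a][b] = ds[b] - sum(G[c][a][b] * s[c] for c in range(2)) - s[a] * s[b]
    return lhs


u0 = [math.pi / 6, math.pi / 3]
lhs = pde_lhs(u0)
# g_ab / (r r^eta) for this model is diag(1, sin^2 u1).
rhs = [[1.0, 0.0], [0.0, math.sin(u0[0]) ** 2]]
```

With this gauge the coordinates $\nu(u)\eta_a(u)$, a = 1, 2, are proportional to $\eta_a/\eta_3$, that is, to the central projection of the sphere onto a plane, which maps great circles to straight lines.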






7.2 Hyperboloid model 

This is an (m + 1, m)-curved exponential family, of which density functions with respect to the invariant measure on the m-dimensional unit hyperboloid under hyperbolic transformations are given by (cf. Barndorff-Nielsen et al. (1989), p. 104)

\[
\begin{aligned}
f_c(x, 1, u) &= \exp\{\theta(u) \cdot x - \psi(\theta(u))\}, \quad \theta^1 = -r\xi^1, \quad \theta^i = r\xi^i, \quad i = 2, \ldots, m + 1, \\
\xi &= (\xi^i) \in H^m = \{\xi \in \mathbb{R}^{m+1} \mid \xi * \xi = 1, \ \xi^1 > 0\}, \quad x = (x_i) \in H^m, \\
\psi(\theta) &= -\log a_m(r), \quad 1/a_m(r) = 2(2\pi)^{(m-1)/2}\, r^{(1-m)/2}\, K_{(m-1)/2}(r), \quad r > 0,
\end{aligned}
\]

where $K_{(m-1)/2}(r)$ is the modified Bessel function of the third kind and of order $(m - 1)/2$. We assume that the concentration parameter $r$ is a given positive constant.
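The parametric representations $\theta = \theta(u)$ and $\eta = \eta(u)$ are given by

\[
\begin{aligned}
\theta^1(u) &= -r \cosh u^1, & \eta_1(u) &= r^\eta \cosh u^1, \\
\theta^2(u) &= r \sinh u^1 \cos u^2, & \eta_2(u) &= r^\eta \sinh u^1 \cos u^2, \\
\theta^3(u) &= r \sinh u^1 \sin u^2 \cos u^3, & \eta_3(u) &= r^\eta \sinh u^1 \sin u^2 \cos u^3, \\
&\ \ \vdots & &\ \ \vdots \\
\theta^{m+1}(u) &= r \sinh u^1 \sin u^2 \cdots \sin u^{m-1} \sin u^m, & \eta_{m+1}(u) &= r^\eta \sinh u^1 \sin u^2 \cdots \sin u^{m-1} \sin u^m,
\end{aligned}
\]

where $u^1 \in \mathbb{R}$, $0 < u^2, \ldots, u^{m-1} < \pi$, $0 < u^m < 2\pi$ and $r^\eta = d \log a_m(r)/dr = K_{(m+1)/2}(r)/K_{(m-1)/2}(r)$. Note that $E[x] = r^\eta \xi$, and $r^\eta$ is a strictly decreasing function of $r$ which maps $(0, \infty)$ onto $(1, \infty)$.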
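For the hyperboloid model the same quantities can be evaluated with the modified Bessel function $K_\nu$. The following Python sketch uses the integral representation $K_\nu(x) = \int_0^\infty \exp(-x \cosh t) \cosh(\nu t)\, dt$ with the trapezoidal rule; the truncation point, the step count and the function names are choices made for this illustration and are adequate only for small orders.

```python
import math


def bessel_k(order, x, steps=4000):
    """K_order(x) by the trapezoidal rule on its integral representation."""
    upper = math.acosh(1.0 + 60.0 / x)  # the integrand is negligible beyond this point
    h = upper / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(order * t)
    return total * h


def r_eta_hyperboloid(r, m):
    """r^eta = K_{(m+1)/2}(r) / K_{(m-1)/2}(r)."""
    return bessel_k((m + 1) / 2.0, r) / bessel_k((m - 1) / 2.0, r)


def hyperboloid_vector(u):
    """xi(u) = (cosh u1, sinh u1 cos u2, ..., sinh u1 sin u2 ... sin um)."""
    xi = [math.cosh(u[0])]
    prod = math.sinh(u[0])
    for angle in u[1:]:
        xi.append(prod * math.cos(angle))
        prod *= math.sin(angle)
    xi.append(prod)
    return xi


def theta_eta_hyperboloid(u, r):
    """theta^1 = -r xi^1, theta^i = r xi^i (i >= 2), eta = r^eta xi."""
    xi = hyperboloid_vector(u)
    scale = r_eta_hyperboloid(r, len(u))
    theta = [-r * xi[0]] + [r * c for c in xi[1:]]
    return theta, [scale * c for c in xi]


# Example with m = 2, r = 0.1 and u = (0.1, pi/3).
theta_h, eta_h = theta_eta_hyperboloid([0.1, math.pi / 3], 0.1)
```

For m = 2 the ratio reduces to $1 + 1/r$, so $r^\eta = 11$ at $r = 0.1$, which serves as a check of the quadrature.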

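From these representations the tangent vectors $B^i_a(u)$ and $B_{ai}(u)$ can be calculated, and then the unit normal vectors $B^i_\kappa(u)$ and $B_{\kappa i}(u)$ ($\kappa = m + 1$) are derived from the relations $B^i_\kappa(u) B_{ai}(u) = 0$ and $B_{\kappa i}(u) B^i_a(u) = 0$ as follows.

\[
\begin{aligned}
B^1_\kappa(u) &= \cosh u^1, & B_{\kappa 1}(u) &= \cosh u^1, \\
B^2_\kappa(u) &= -\sinh u^1 \cos u^2, & B_{\kappa 2}(u) &= \sinh u^1 \cos u^2, \\
B^3_\kappa(u) &= -\sinh u^1 \sin u^2 \cos u^3, & B_{\kappa 3}(u) &= \sinh u^1 \sin u^2 \cos u^3, \\
&\ \ \vdots & &\ \ \vdots \\
B^{m+1}_\kappa(u) &= -\sinh u^1 \sin u^2 \cdots \sin u^{m-1} \sin u^m, & B_{\kappa\, m+1}(u) &= \sinh u^1 \sin u^2 \cdots \sin u^{m-1} \sin u^m.
\end{aligned}
\]

The above expressions show that

\[ B^i_\kappa(u) = -\frac{1}{r}\,\theta^i(u), \qquad B_{\kappa i}(u) = \frac{1}{r^\eta}\,\eta_i(u), \]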

and again this model is a dual quadric hypersurface. From Theorem 5.2 we also see that this model is ES conjugate m(e)-symmetric and totally e-umbilic with constant m(e)-curvature $-1/(r r^\eta) < 0$. The related geometrical quantities are given below.

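\[ g_{11}(u) = r r^\eta, \qquad g_{ab}(u) = \delta_{ab}\, r r^\eta \sinh^2 u^1 \prod_{c=2}^{a} \sin^2 u^{c-1}, \quad \sin^2 u^1 \equiv 1, \quad a = 2, \ldots, m, \]

\[ R_{abba}(u) = -\frac{1}{r r^\eta}\, g_{aa}(u)\, g_{bb}(u), \ a \neq b, \qquad R_{ab}(u) = -\frac{m - 1}{r r^\eta}\, g_{ab}(u). \]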

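From Theorem 3.2 (i) this $M_c$ is conformally m(e)-flat, so that there exist a gauge function $\nu(u) > 0$ and a (-1)-affine coordinate system $\tilde{u} = (\tilde{u}^a)$ such that $\tilde{\Gamma}^{(-1)}_{abc}(\tilde{u}) = 0$, $\forall u \in M_c$. As given by (29), the partial differential equation for $s(u) = \log \nu(u)$ is

\[ \partial_a s_b - \Gamma^{(-1)c}_{ab} s_c - s_a s_b = -\frac{1}{r r^\eta}\, g_{ab}, \]

of which one solution is

\[ \nu(u) = \frac{1}{|\sinh u^1| \prod_{a=2}^{m} |\sin u^a|}, \]

and then $\tilde{u}^a = \nu(u) D^{ai} \eta_i(u)$.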



7.3 Numerical results 

We examine our theoretical results numerically by using the von Mises-Fisher and the hyperboloid models. We take 10 values of the number of observations $N$ (nonsequential case) and of $K$ (sequential case), and for each $N$ or $K$ we generate 500 random simulated data sets. Then the empirical means of covariances $E_{emp}[(\hat{u}^a_{mle} - u^a_0)(\hat{u}^b_{mle} - u^b_0)]$ (nonsequential case) and $E_{emp}[(\hat{\tilde{u}}^a_{mle} - \tilde{u}^a_0)(\hat{\tilde{u}}^b_{mle} - \tilde{u}^b_0)]$ (sequential case) of the m.l.e. over these 500 samples are used for evaluation, where $u^a_0$ and $\tilde{u}^a_0$ denote the true values of $u^a$ and $\tilde{u}^a$. The stopping times $\tau$ for the sequential estimations are determined by (see Okamoto, Amari and Takeuchi (1991))

\[ \tau = \inf\Bigl\{ t \Bigm| -\frac{1}{m}\, \partial_a \partial_b l(x, t, \hat{u}_{mle})\, g^{ab}(\hat{u}_{mle}) > K\nu(\hat{u}_{mle}) + c \Bigr\}, \]

\ f m 1\ 1/m 1\ 

c = — - — 7 ) : von Mises-Fisher c = — r ) : hyperboloid. 

2 \tt^ J 2 V J 
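To make the stopping rule concrete, the following Python sketch simulates it for the von Mises-Fisher model with m = 2. It reads the rule as: stop at the first $t$ for which $-(1/m)\,\partial_a \partial_b l\, g^{ab}$ at the m.l.e. exceeds $K\nu(\hat{u}_{mle}) + c$. Under the parametrization of Section 7.1 a direct calculation reduces the left-hand side to $|\sum_{i \le t} x_i| / r^\eta$, and this reduction is an assumption of the sketch. The constant $c$ is left as a parameter with placeholder value 0, the sampler is the usual inverse-CDF method on the 2-sphere, and all function names and the values $r = 2$, $K = 50$ are hypothetical.

```python
import math
import random


def r_eta_m2(r):
    """r^eta for m = 2, using I_{3/2}(r) / I_{1/2}(r) = coth(r) - 1/r."""
    return 1.0 / math.tanh(r) - 1.0 / r


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]


def sample_vmf(mu, frame, r, rng):
    """One draw on the unit 2-sphere with mean direction mu and concentration r."""
    p = rng.random()
    w = 1.0 + math.log(p + (1.0 - p) * math.exp(-2.0 * r)) / r  # cosine of the angle to mu
    rho = math.sqrt(max(0.0, 1.0 - w * w))
    phi = 2.0 * math.pi * rng.random()
    e1, e2 = frame
    return [w * mu[i] + rho * (math.cos(phi) * e1[i] + math.sin(phi) * e2[i]) for i in range(3)]


def gauge(xi):
    """nu = 1 / (|sin u1| |sin u2|), which equals 1 / |xi_3| in spherical coordinates."""
    return math.inf if xi[2] == 0.0 else 1.0 / abs(xi[2])


def stopping_time(mu, r, K, c=0.0, seed=0, max_steps=100000):
    """First t with |sum of x_i| / r^eta > K nu(u_mle) + c; c = 0 is a placeholder."""
    rng = random.Random(seed)
    helper = [1.0, 0.0, 0.0] if abs(mu[0]) < 0.9 else [0.0, 1.0, 0.0]
    e1 = cross(mu, helper)
    n1 = norm(e1)
    e1 = [v / n1 for v in e1]
    frame = (e1, cross(mu, e1))  # orthonormal basis of the plane orthogonal to mu
    total = [0.0, 0.0, 0.0]
    for t in range(1, max_steps + 1):
        x = sample_vmf(mu, frame, r, rng)
        total = [a + b for a, b in zip(total, x)]
        resultant = norm(total)
        if resultant == 0.0:
            continue
        xi_hat = [v / resultant for v in total]  # m.l.e. of the mean direction
        if resultant / r_eta_m2(r) > K * gauge(xi_hat) + c:
            return t, total
    raise RuntimeError("no stop within max_steps")


# True mean direction at u0 = (pi/6, pi/3); r and K are illustrative values only.
u1, u2 = math.pi / 6, math.pi / 3
mu0 = [math.cos(u1), math.sin(u1) * math.cos(u2), math.sin(u1) * math.sin(u2)]
tau, resultant_vector = stopping_time(mu0, r=2.0, K=50.0, seed=1)
```

Since the resultant length grows roughly like $t\, r^\eta$, the rule stops near $t \approx K\nu(u)$, in line with the requirement $E_u[\tau] = K\nu(u)$.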

As for the von Mises-Fisher model, numerical results are based on the following set of values

\[ m = 2, \quad r = 0.25, \quad (u^1_0, u^2_0) = (\pi/6, \pi/3), \]

\[ D^{ai} = \delta^{ai}, \ \text{i.e.,} \ \tilde{u}^1 = \nu(u)\eta_1(u), \ \tilde{u}^2 = \nu(u)\eta_2(u), \quad \nu(u) = 1/(|\sin u^1||\sin u^2|), \]

and for the hyperboloid model, numerical results are based on the following set of values

\[ m = 2, \quad r = 0.1, \quad (u^1_0, u^2_0) = (0.1, \pi/3), \]

\[ D^{ai} = \delta^{ai}/100, \ \text{i.e.,} \ \tilde{u}^1 = \nu(u)\eta_1(u)/100, \ \tilde{u}^2 = \nu(u)\eta_2(u)/100, \quad \nu(u) = 1/(|\sinh u^1||\sin u^2|). \]

Figures 3-8 show the von Mises-Fisher model, and Figures 9-14 show the hyperboloid model. The notations in the figures indicate the following quantities.

• nonsequential case

$OCOV_{ab} = N E_{emp}[(\hat{u}^a_{mle} - u^a_0)(\hat{u}^b_{mle} - u^b_0)]$, $a, b = 1, 2$

$OCRB_{ab} = g^{ab}(u_0)$,

OALBab = g'^^uo) + ^{^(ri^.'Vo))^"^ + (< W)^""} 

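• sequential case

$CCOV_{ab} = E_{emp}(\tau)\, E_{emp}[(\hat{\tilde{u}}^a_{mle} - \tilde{u}^a_0)(\hat{\tilde{u}}^b_{mle} - \tilde{u}^b_0)]$, $a, b = 1, 2$

$CCRB_{ab} = \tilde{g}^{ab}(\tilde{u}_0)$,

$MST = E_{emp}(\tau)$ : empirical mean of $\tau$

$SDST = \sqrt{V_{emp}(\tau)}$ : empirical standard deviation of $\tau$.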
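The sequential-case summaries can be computed from the replicated runs as in the following Python sketch. It takes the stopping times and the estimates from each replication and returns MST, SDST and the matrix CCOV. The use of the population variance (division by the number of replications) for the empirical standard deviation, and the function name, are assumptions of the sketch.

```python
import math
import statistics


def sequential_summaries(taus, estimates, true_value):
    """Return (MST, SDST, CCOV) from replicated stopping times and estimates.

    taus:       stopping times, one per replication
    estimates:  estimated coordinate vectors, one per replication
    true_value: true coordinate vector
    """
    taus = [float(t) for t in taus]
    n = len(taus)
    dim = len(true_value)
    mst = statistics.mean(taus)
    sdst = math.sqrt(statistics.pvariance(taus))
    # Empirical mean of the products of deviations, scaled by the mean stopping time.
    ccov = [[mst * sum((e[a] - true_value[a]) * (e[b] - true_value[b]) for e in estimates) / n
             for b in range(dim)] for a in range(dim)]
    return mst, sdst, ccov


# Tiny hypothetical data set with three replications.
mst, sdst, ccov = sequential_summaries(
    [10, 12, 14],
    [(1.0, 2.0), (1.2, 1.8), (0.8, 2.2)],
    (1.0, 2.0),
)
```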

We see that in the nonsequential case $OCOV_{ab}$ approaches the asymptotic lower bound $OALB_{ab}$, exhibiting the differential geometrical loss $OALB_{ab} - OCRB_{ab}$, and in the sequential case $CCOV_{ab}$ nearly attains the Cramér-Rao lower bound $CCRB_{ab}$ as if the model were a f.r.m. exponential family. Figures 8 and 14 confirm that the assumptions $MST = O(K)$, $SDST = O(\sqrt{K})$ are satisfied in each model.

Figure 3: OCOV11 von Mises-Fisher
Figure 4: OCOV22 von Mises-Fisher
Figure 5: CCOV11 von Mises-Fisher
Figure 6: CCOV22 von Mises-Fisher
Figure 7: CCOV12 von Mises-Fisher
Figure 8: MST SDST von Mises-Fisher
Figure 9: OCOV11 hyperboloid
Figure 10: OCOV22 hyperboloid
Figure 11: CCOV11 hyperboloid
Figure 12: CCOV22 hyperboloid
Figure 13: CCOV12 hyperboloid
Figure 14: MST SDST hyperboloid

8 Discussion



We have analyzed sequential estimation procedures in terms of the conformal geometry of statistical manifolds. We have also constructed a concrete procedure for the covariance minimization in a multidimensional curved exponential family $M_c$. The method is divided into two separate stages: one is to choose a stopping rule which is effective for reducing the 1-ES curvature $H^{(1)}_{M_c}$, and the other is to choose a gauge function $\nu(u)$ on $M_c$ effective for reducing the (-1)-connection $\Gamma^{(-1)}_{M_c}$. Another typical choice of $\nu(u)$ is the one effective for the covariance stabilization, as suggested in Okamoto, Amari and Takeuchi (1991). These choices contradict each other in general multidimensional cases, and this fact reflects the difference between the ordinary Riemannian geometry and the mutually dual geometry as exhibited in several geometrical notions introduced in this paper.

The present method is also applicable to investigating sequential testing procedures. 
The geometrical theory of higher-order asymptotics of testing hypothesis in nonsequential 
case was developed by Kumon and Amari (1983) and Amari (1985). The main results are 
summarized as follows. 

The power function $P_T(t)$ of a test $T$ is expanded as

\[ P_T(t) = P_{T1}(t) + P_{T2}(t)/\sqrt{N} + P_{T3}(t)/N + O(N^{-3/2}), \]

where $N$ denotes the number of observations, and $t/\sqrt{N}$ indicates the geodesic distance between the null hypothesis and the point in the alternative hypothesis.

(i) The first-order power function $P_{T1}(t)$ and the second-order power function $P_{T2}(t)$ are maximized uniformly in $t$ if and only if the ancillary family (boundaries of the critical region) associated with a test $T$ is asymptotically an orthogonal family.

(ii) The third-order power loss function $\Delta P_{T3}(t) = \sup_T P_{T3}(t) - P_{T3}(t)$ is expressed as the weighted sum of two kinds of the square of the 1-ES curvatures $H^{(1)}$, the square of the (-1)-ES mixture curvature $H^{(-1)}$ of the associated ancillary family, and also the square of the (-1)-mixture connection $\Gamma^{(-1)}$ (when there are unknown nuisance parameters).

Based on these nonsequential results, we can apply the conformal geometry to the analysis and the construction of most powerful sequential tests. Specifically, when a statistical manifold is a f.r.m. exponential family or a dual quadric hypersurface, it is expected that one can design sequential tests without any power loss. This is a subject which will be treated in a future work.